SEQUENCE LISTING

(1) GENERAL INFORMATION: 



(i) APPLICANT: Gotschlich, Emil C. 



(ii) TITLE OF INVENTION: GLYCOSYLTRANSFERASES FOR BIOSYNTHESIS OF
OLIGOSACCHARIDES, AND GENES ENCODING THEM



(iii) NUMBER OF SEQUENCES:

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: Klauber & Jackson 

(B) STREET: 411 Hackensack Avenue 

(C) CITY: Hackensack 

(D) STATE: New Jersey 

(E) COUNTRY: USA 

(F) ZIP: 07601 



(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: 08/312,387 

(B) FILING DATE: September 26, 1994 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Jackson Esq., David A. 

(B) REGISTRATION NUMBER: 26,742 

(C) REFERENCE/DOCKET NUMBER: 600-1-095B 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 201 487-5800 

(B) TELEFAX: 201 343-1684 

(C) TELEX: 133521 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5859 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: both

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Neisseria gonorrhoeae



(B) STRAIN: F62

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..381 

(C) GENE: glyS (glycyl tRNA synthetase beta chain)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 445..1491

(C) GENE: lgtA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 2342..3262

(C) GENE: lgtC

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 3322..4335

(C) GENE: lgtD

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 4354..5196

(C) GENE: lgtE

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

CTG CAG GCC GTC GCC GTA TTC AAA CAA CTG CCC GAA GCC GCC GCG CTC 48
Leu Gln Ala Val Ala Val Phe Lys Gln Leu Pro Glu Ala Ala Ala Leu
1 5 10 15

GCC GCC GCC AAC AAA CGC GTG CAA AAC CTG CTG AAA AAA GCC GAT GCC 96
Ala Ala Ala Asn Lys Arg Val Gln Asn Leu Leu Lys Lys Ala Asp Ala
20 25 30

GCG TTG GGC GAA GTC AAT GAA AGC CTG CTG CAA CAG GAC GAA GAA AAA 144
Ala Leu Gly Glu Val Asn Glu Ser Leu Leu Gln Gln Asp Glu Glu Lys
35 40 45

GCC CTG TAC GCT GCC GCG CAA GGT TTG CAG CCG AAA ATT GCC GCC GCC 192
Ala Leu Tyr Ala Ala Ala Gln Gly Leu Gln Pro Lys Ile Ala Ala Ala
50 55 60

GTC GCC GAA GGC AAT TTC CGA ACC GCC TTG TCC GAA CTG GCT TCC GTC 240
Val Ala Glu Gly Asn Phe Arg Thr Ala Leu Ser Glu Leu Ala Ser Val
65 70 75 80

AAG CCG CAG GTT GAT GCC TTC TTC GAC GGC GTG ATG GTG ATG GCG GAA 288
Lys Pro Gln Val Asp Ala Phe Phe Asp Gly Val Met Val Met Ala Glu
85 90 95

GAT GCC GCC GTA AAA CAA AAC CGC CTG AAC CTG CTG AAC CGC TTG GCA 336
Asp Ala Ala Val Lys Gln Asn Arg Leu Asn Leu Leu Asn Arg Leu Ala
100 105 110

GAG CAG ATG AAC GCG GTG GCC GAC ATC GCG CTT TTG GGC GAG TAACCGTTGT 388
Glu Gln Met Asn Ala Val Ala Asp Ile Ala Leu Leu Gly Glu
115 120 125
ACAGTCCAAA TGCCGTCTGA AGCCTTCAGG CGGCATCAAA TTATCGGGAG AGTAAA 444

TTG CAG CCT TTA GTC AGC GTA TTG ATT TGC GCC TAC AAC GTA GAA AAA 492
Met Gln Pro Leu Val Ser Val Leu Ile Cys Ala Tyr Asn Val Glu Lys
1 5 10 15

TAT TTT GCC CAA TCA TTA GCC GCC GTC GTG AAT CAG ACT TGG CGC AAC 540
Tyr Phe Ala Gln Ser Leu Ala Ala Val Val Asn Gln Thr Trp Arg Asn
20 25 30

TTG GAT ATT TTG ATT GTC GAT GAC GGC TCG ACA GAC GGC ACA CTT GCC 588
Leu Asp Ile Leu Ile Val Asp Asp Gly Ser Thr Asp Gly Thr Leu Ala
35 40 45

ATT GCC AAG GAT TTT CAA AAG CGG GAC AGC CGT ATC AAA ATC CTT GCA 636
Ile Ala Lys Asp Phe Gln Lys Arg Asp Ser Arg Ile Lys Ile Leu Ala
50 55 60

CAA GCT CAA AAT TCC GGC CTG ATT CCC TCT TTA AAC ATC GGG CTG GAC 684
Gln Ala Gln Asn Ser Gly Leu Ile Pro Ser Leu Asn Ile Gly Leu Asp
65 70 75 80

GAA TTG GCA AAG TCG GGG GGG GGG GGG GGG GAA TAT ATT GCG CGC ACC 732
Glu Leu Ala Lys Ser Gly Gly Gly Gly Gly Glu Tyr Ile Ala Arg Thr
85 90 95

GAT GCC GAC GAT ATT GCC TCC CCC GGC TGG ATT GAG AAA ATC GTG GGC 780
Asp Ala Asp Asp Ile Ala Ser Pro Gly Trp Ile Glu Lys Ile Val Gly
100 105 110

GAG ATG GAA AAA GAC CGC AGC ATC ATT GCG ATG GGC GCG TGG CTG GAA 828
Glu Met Glu Lys Asp Arg Ser Ile Ile Ala Met Gly Ala Trp Leu Glu
115 120 125

GTT TTG TCG GAA GAA AAG GAC GGC AAC CGG CTG GCG CGG CAC CAC AAA 876
Val Leu Ser Glu Glu Lys Asp Gly Asn Arg Leu Ala Arg His His Lys
130 135 140

CAC GGC AAA ATT TGG AAA AAG CCG ACC CGG CAC GAA GAC ATC GCC GCC 924
His Gly Lys Ile Trp Lys Lys Pro Thr Arg His Glu Asp Ile Ala Ala
145 150 155 160

TTT TTC CCT TTC GGC AAC CCC ATA CAC AAC AAC ACG ATG ATT ATG CGG 972
Phe Phe Pro Phe Gly Asn Pro Ile His Asn Asn Thr Met Ile Met Arg
165 170 175

CGC AGC GTC ATT GAC GGC GGT TTG CGT TAC GAC ACC GAG CGG GAT TGG 1020
Arg Ser Val Ile Asp Gly Gly Leu Arg Tyr Asp Thr Glu Arg Asp Trp
180 185 190

GCG GAA GAT TAC CAA TTT TGG TAC GAT GTC AGC AAA TTG GGC AGG CTG 1068
Ala Glu Asp Tyr Gln Phe Trp Tyr Asp Val Ser Lys Leu Gly Arg Leu
195 200 205

GCT TAT TAT CCC GAA GCC TTG GTC AAA TAC CGC CTT CAC GCC AAT CAG 1116
Ala Tyr Tyr Pro Glu Ala Leu Val Lys Tyr Arg Leu His Ala Asn Gln
210 215 220

GTT TCA TCC AAA CAC AGC GTC CGC CAA CAC GAA ATC GCG CAA GGC ATC 1164
Val Ser Ser Lys His Ser Val Arg Gln His Glu Ile Ala Gln Gly Ile
225 230 235 240

CAA AAA ACC GCC AGA AAC GAT TTT TTG CAG TCT ATG GGT TTT AAA ACC 1212
Gln Lys Thr Ala Arg Asn Asp Phe Leu Gln Ser Met Gly Phe Lys Thr
245 250 255

CGG TTC GAC AGC CTA GAA TAC CGC CAA ACA AAA GCA GCG GCG TAT GAA 1260
Arg Phe Asp Ser Leu Glu Tyr Arg Gln Thr Lys Ala Ala Ala Tyr Glu
260 265 270

CTG CCG GAG AAG GAT TTG CCG GAA GAA GAT TTT GAA CGC GCC CGC CGG 1308
Leu Pro Glu Lys Asp Leu Pro Glu Glu Asp Phe Glu Arg Ala Arg Arg
275 280 285

TTT TTG TAC CAA TGC TTC AAA CGG ACG GAC ACG CCG CCC TCC GGC GCG 1356
Phe Leu Tyr Gln Cys Phe Lys Arg Thr Asp Thr Pro Pro Ser Gly Ala
290 295 300

TGG CTG GAT TTC GCG GCA GAC GGC AGG ATG AGG CGG CTG TTT ACC TTG 1404
Trp Leu Asp Phe Ala Ala Asp Gly Arg Met Arg Arg Leu Phe Thr Leu
305 310 315 320

AGG CAA TAC TTC GGC ATT TTG TAC CGG CTG ATT AAA AAC CGC CGG CAG 1452
Arg Gln Tyr Phe Gly Ile Leu Tyr Arg Leu Ile Lys Asn Arg Arg Gln
325 330 335

GCG CGG TCG GAT TCG GCA GGG AAA GAA CAG GAG ATT TAATGCAAAA 1498
Ala Arg Ser Asp Ser Ala Gly Lys Glu Gln Glu Ile
340 345



CCACGTTATC AGCTTGGCTT CCGCCGCAGA ACGCAGGGCG CACATTGCCG CAACCTTCGG 1558
CAGTCGCGGC ATCCCGTTCC AGTTTTTCGA CGCACTGATG CCGTCTGAAA GGCTGGAACG 1618
GGCAATGGCG GAACTCGTCC CCGGCTTGTC GGCGCACCCC TATTTGAGCG GAGTGGAAAA 1678
AGCCTGCTTT ATGAGCCACG CCGTATTGTG GGAACAGGCA TTGGACGAAG GCGTACCGTA 1738
TATCGCCGTA TTTGAAGATG ATGTCTTACT CGGCGAAGGC GCGGAGCAGT TCCTTGCCGA 1798
AGATACTTGG CTGCAAGAAC GCTTTGACCC CGATTCCGCC TTTGTCGTCC GCTTGGAAAC 1858
GATGTTTATG CACGTCCTGA CCTCGCCCTC CGGCGTGGCG GACTACGGCG GGCGCGCCTT 1918
TCCGCTTTTG GAAAGCGAAC ACTGCGGGAC GGCGGGCTAT ATTATTTCCC GAAAGGCGAT 1978
GCGTTTTTTC TTGGACAGGT TTGCCGTTTT GCCGCCCGAA CGCCTGCACC CTGTCGATTT 2038
GATGATGTTC GGCAACCCTG ACGACAGGGA AGGAATGCCG GTTTGCCAGC TCAATCCCGC 2098
CTTGTGCGCC CAAGAGCTGC ATTATGCCAA GTTTCACGAC CAAAACAGCG CATTGGGCAG 2158
CCTGATCGAA CATGACCGCC GCCTGAACCG CAAACAGCAA TGGCGCGATT CCCCCGCCAA 2218
CACATTCAAA CACCGCCTGA TCCGCGCCTT GACCAAAATC GGCAGGGAAA GGGAAAAACG 2278
CCGGCAAAGG CGCGAACAGT TAATCGGCAA GATTATTGTG CCTTTCCAAT AAAAGGAGAA 2338


AAG ATG GAC ATC GTA TTT GCG GCA GAC GAC AAC TAT GCC GCC TAC CTT 2386
Met Asp Ile Val Phe Ala Ala Asp Asp Asn Tyr Ala Ala Tyr Leu
1 5 10 15

TGC GTT GCG GCA AAA AGC GTG GAA GCG GCC CAT CCC GAT ACG GAA ATC 2434
Cys Val Ala Ala Lys Ser Val Glu Ala Ala His Pro Asp Thr Glu Ile
20 25 30

AGG TTC CAC GTC CTC GAT GCC GGC ATC AGT GAG GAA AAC CGG GCG GCG 2482
Arg Phe His Val Leu Asp Ala Gly Ile Ser Glu Glu Asn Arg Ala Ala
35 40 45

GTT GCC GCC AAT TTG CGG GGG GGG GGT AAT ATC CGC TTT ATA GAC GTA 2530
Val Ala Ala Asn Leu Arg Gly Gly Gly Asn Ile Arg Phe Ile Asp Val
50 55 60

AAC CCC GAA GAT TTC GCC GGC TTC CCC TTA AAC ATC AGG CAC ATT TCC 2578
Asn Pro Glu Asp Phe Ala Gly Phe Pro Leu Asn Ile Arg His Ile Ser
65 70 75

ATT ACG ACT TAT GCC CGC CTG AAA TTG GGC GAA TAC ATT GCC GAT TGC 2626
Ile Thr Thr Tyr Ala Arg Leu Lys Leu Gly Glu Tyr Ile Ala Asp Cys
80 85 90 95

GAC AAA GTC CTG TAT CTG GAT ACG GAC GTA TTG GTC AGG GAC GGC CTG 2674
Asp Lys Val Leu Tyr Leu Asp Thr Asp Val Leu Val Arg Asp Gly Leu
100 105 110

AAG CCC TTA TGG GAT ACC GAT TTG GGC GGT AAC TGG GTC GGC GCG TGC 2722
Lys Pro Leu Trp Asp Thr Asp Leu Gly Gly Asn Trp Val Gly Ala Cys
115 120 125

ATC GAT TTG TTT GTC GAA AGG CAG GAA GGA TAC AAA CAA AAA ATC GGT 2770
Ile Asp Leu Phe Val Glu Arg Gln Glu Gly Tyr Lys Gln Lys Ile Gly
130 135 140

ATG GCG GAC GGA GAA TAT TAT TTC AAT GCC GGC GTA TTG CTG ATC AAC 2818
Met Ala Asp Gly Glu Tyr Tyr Phe Asn Ala Gly Val Leu Leu Ile Asn
145 150 155

CTG AAA AAG TGG CGG CGG CAC GAT ATT TTC AAA ATG TCC TGC GAA TGG 2866
Leu Lys Lys Trp Arg Arg His Asp Ile Phe Lys Met Ser Cys Glu Trp
160 165 170 175

GTG GAA CAA TAC AAG GAC GTG ATG CAA TAT CAG GAT CAG GAC ATT TTG 2914
Val Glu Gln Tyr Lys Asp Val Met Gln Tyr Gln Asp Gln Asp Ile Leu
180 185 190

AAC GGG CTG TTT AAA GGC GGG GTG TGT TAT GCG AAC AGC CGT TTC AAC 2962
Asn Gly Leu Phe Lys Gly Gly Val Cys Tyr Ala Asn Ser Arg Phe Asn
195 200 205



TTT ATG CCG ACC AAT TAT GCC TTT ATG GCG AAC GGG TTT GCG TCC CGC 3010
Phe Met Pro Thr Asn Tyr Ala Phe Met Ala Asn Gly Phe Ala Ser Arg
210 215 220

CAT ACC GAC CCG CTT TAC CTC GAC CGT ACC AAT ACG GCG ATG CCC GTC 3058
His Thr Asp Pro Leu Tyr Leu Asp Arg Thr Asn Thr Ala Met Pro Val
225 230 235

GCC GTC AGC CAT TAT TGC GGC TCG GCA AAG CCG TGG CAC AGG GAC TGC 3106
Ala Val Ser His Tyr Cys Gly Ser Ala Lys Pro Trp His Arg Asp Cys
240 245 250 255

ACC GTT TGG GGT GCG GAA CGT TTC ACA GAG TTG GCC GGC AGC CTG ACG 3154
Thr Val Trp Gly Ala Glu Arg Phe Thr Glu Leu Ala Gly Ser Leu Thr
260 265 270

ACC GTT CCC GAA GAA TGG CGC GGC AAA CTT GCC GTC CCG CCG ACA AAG 3202
Thr Val Pro Glu Glu Trp Arg Gly Lys Leu Ala Val Pro Pro Thr Lys
275 280 285

TGT ATG CTT CAA AGA TGG CGC AAA AAG CTG TCT GCC AGA TTC TTA CGC 3250
Cys Met Leu Gln Arg Trp Arg Lys Lys Leu Ser Ala Arg Phe Leu Arg
290 295 300

AAG ATT TAT TGACGGGGCA GGCCGTCTGA AGCCTTCAGA CGGCATCGGA 3299
Lys Ile Tyr
305

CGTATCGGAA AGGAGAAACG GA TTG CAG CCT TTA GTC AGC GTA TTG ATT TGC 3351
Met Gln Pro Leu Val Ser Val Leu Ile Cys
1 5 10

GCC TAC AAC GCA GAA AAA TAT TTT GCC CAA TCA TTG GCC GCC GTA GTG 3399
Ala Tyr Asn Ala Glu Lys Tyr Phe Ala Gln Ser Leu Ala Ala Val Val
15 20 25

GGG CAG ACT TGG CGC AAC TTG GAT ATT TTG ATT GTC GAT GAC GGC TCG 3447
Gly Gln Thr Trp Arg Asn Leu Asp Ile Leu Ile Val Asp Asp Gly Ser
30 35 40

ACG GAC GGC ACG CCC GCC ATT GCC CGG CAT TTC CAA GAA CAG GAC GGC 3495
Thr Asp Gly Thr Pro Ala Ile Ala Arg His Phe Gln Glu Gln Asp Gly
45 50 55

AGG ATC AGG ATA ATT TCC AAT CCC CGC AAT TTG GGC TTT ATC GCC TCT 3543
Arg Ile Arg Ile Ile Ser Asn Pro Arg Asn Leu Gly Phe Ile Ala Ser
60 65 70

TTA AAC ATC GGG CTG GAC GAA TTG GCA AAG TCG GGG GGG GGG GAA TAT 3591
Leu Asn Ile Gly Leu Asp Glu Leu Ala Lys Ser Gly Gly Gly Glu Tyr
75 80 85 90

ATT GCG CGC ACC GAT GCC GAC GAT ATT GCC TCC CCC GGC TGG ATT GAG 3639
Ile Ala Arg Thr Asp Ala Asp Asp Ile Ala Ser Pro Gly Trp Ile Glu
95 100 105

AAA ATC GTG GGC GAG ATG GAA AAA GAC CGC AGC ATC ATT GCG ATG GGC 3687
Lys Ile Val Gly Glu Met Glu Lys Asp Arg Ser Ile Ile Ala Met Gly
110 115 120

GCG TGG TTG GAA GTT TTG TCG GAA GAA AAC AAT AAA AGC GTG CTT GCC 3735
Ala Trp Leu Glu Val Leu Ser Glu Glu Asn Asn Lys Ser Val Leu Ala
125 130 135

GCC ATT GCC CGA AAC GGC GCA ATT TGG GAC AAA CCG ACC CGG CAT GAA 3783
Ala Ile Ala Arg Asn Gly Ala Ile Trp Asp Lys Pro Thr Arg His Glu
140 145 150

GAC ATT GTC GCC GTT TTC CCT TTC GGC AAC CCC ATA CAC AAC AAC ACG 3831
Asp Ile Val Ala Val Phe Pro Phe Gly Asn Pro Ile His Asn Asn Thr
155 160 165 170

ATG ATT ATG AGG CGC AGC GTC ATT GAC GGC GGT TTG CGG TTC GAT CCA 3879
Met Ile Met Arg Arg Ser Val Ile Asp Gly Gly Leu Arg Phe Asp Pro
175 180 185

GCC TAT ATC CAC GCC GAA GAC TAT AAG TTT TGG TAC GAA GCC GGC AAA 3927
Ala Tyr Ile His Ala Glu Asp Tyr Lys Phe Trp Tyr Glu Ala Gly Lys
190 195 200



CTG GGC AGG CTG GCT TAT TAT CCC GAA GCC TTG GTC AAA TAC CGC TTC 3975
Leu Gly Arg Leu Ala Tyr Tyr Pro Glu Ala Leu Val Lys Tyr Arg Phe
205 210 215

CAT CAA GAC CAG ACT TCT TCC AAA TAC AAC CTG CAA CAG CGC AGG ACG 4023
His Gln Asp Gln Thr Ser Ser Lys Tyr Asn Leu Gln Gln Arg Arg Thr
220 225 230

GCG TGG AAA ATC AAA GAA GAA ATC AGG GCG GGG TAT TGG AAG GCG GCA 4071
Ala Trp Lys Ile Lys Glu Glu Ile Arg Ala Gly Tyr Trp Lys Ala Ala
235 240 245 250

GGC ATA GCC GTC GGG GCG GAC TGC CTG AAT TAC GGG CTT TTG AAA TCA 4119
Gly Ile Ala Val Gly Ala Asp Cys Leu Asn Tyr Gly Leu Leu Lys Ser
255 260 265

ACG GCA TAT GCG TTG TAC GAA AAA GCC TTG TCC GGA CAG GAT ATC GGA 4167
Thr Ala Tyr Ala Leu Tyr Glu Lys Ala Leu Ser Gly Gln Asp Ile Gly
270 275 280

TGC CTC CGC CTG TTC CTG TAC GAA TAT TTC TTG TCG TTG GAA AAG TAT 4215
Cys Leu Arg Leu Phe Leu Tyr Glu Tyr Phe Leu Ser Leu Glu Lys Tyr
285 290 295

TCT TTG ACC GAT TTG CTG GAT TTC TTG ACA GAC CGC GTG ATG AGG AAG 4263
Ser Leu Thr Asp Leu Leu Asp Phe Leu Thr Asp Arg Val Met Arg Lys
300 305 310

CTG TTT GCC GCA CCG CAA TAT AGG AAA ATC CTG AAA AAA ATG TTA CGC 4311
Leu Phe Ala Ala Pro Gln Tyr Arg Lys Ile Leu Lys Lys Met Leu Arg
315 320 325 330

CCT TGG AAA TAC CGC AGC TAT TGAAACCGAA CAGGATAAAT C ATG CAA AAC 4362
Pro Trp Lys Tyr Arg Ser Tyr Met Gln Asn
335 1



CAC GTT ATC AGC TTG GCT TCC GCC GCA GAG CGC AGG GCG CAC ATT GCC 4410
His Val Ile Ser Leu Ala Ser Ala Ala Glu Arg Arg Ala His Ile Ala
5 10 15

GAT ACC TTC GGC AGT CGC GGC ATC CCG TTC CAG TTT TTC GAC GCA CTG 4458
Asp Thr Phe Gly Ser Arg Gly Ile Pro Phe Gln Phe Phe Asp Ala Leu
20 25 30 35

ATG CCG TCT GAA AGG CTG GAA CAG GCG ATG GCG GAA CTC GTC CCC GGC 4506
Met Pro Ser Glu Arg Leu Glu Gln Ala Met Ala Glu Leu Val Pro Gly
40 45 50

TTG TCG GCG CAC CCC TAT TTG AGC GGA GTG GAA AAA GCC TGC TTT ATG 4554
Leu Ser Ala His Pro Tyr Leu Ser Gly Val Glu Lys Ala Cys Phe Met
55 60 65

AGC CAC GCC GTA TTG TGG GAA CAG GCG TTG GAT GAA GGT CTG CCG TAT 4602
Ser His Ala Val Leu Trp Glu Gln Ala Leu Asp Glu Gly Leu Pro Tyr
70 75 80

ATC GCC GTA TTT GAG GAC GAC GTT TTA CTC GGC GAA GGC GCG GAG CAG 4650
Ile Ala Val Phe Glu Asp Asp Val Leu Leu Gly Glu Gly Ala Glu Gln
85 90 95

TTC CTT GCC GAA GAT ACT TGG TTG GAA GAG CGT TTT GAC AAG GAT TCC 4698
Phe Leu Ala Glu Asp Thr Trp Leu Glu Glu Arg Phe Asp Lys Asp Ser
100 105 110 115

GCC TTT ATC GTC CGT TTG GAA ACG ATG TTT GCG AAA GTT ATT GTC AGA 4746
Ala Phe Ile Val Arg Leu Glu Thr Met Phe Ala Lys Val Ile Val Arg
120 125 130

CCG GAT AAA GTC CTG AAT TAT GAA AAC CGG TCA TTT CCT TTG CTG GAG 4794
Pro Asp Lys Val Leu Asn Tyr Glu Asn Arg Ser Phe Pro Leu Leu Glu
135 140 145

AGC GAA CAT TGT GGG ACG GCT GGC TAT ATC ATT TCG CGT GAG GCG ATG 4842
Ser Glu His Cys Gly Thr Ala Gly Tyr Ile Ile Ser Arg Glu Ala Met
150 155 160

CGG TTT TTC TTG GAC AGG TTT GCC GTT TTG CCG CCA GAG CGG ATT AAA 4890
Arg Phe Phe Leu Asp Arg Phe Ala Val Leu Pro Pro Glu Arg Ile Lys
165 170 175

GCG GTA GAT TTG ATG ATG TTT ACT TAT TTC TTT GAT AAG GAG GGG ATG 4938
Ala Val Asp Leu Met Met Phe Thr Tyr Phe Phe Asp Lys Glu Gly Met
180 185 190 195

CCT GTT TAT CAG GTT AGT CCC GCC TTA TGT ACC CAA GAA TTG CAT TAT 4986
Pro Val Tyr Gln Val Ser Pro Ala Leu Cys Thr Gln Glu Leu His Tyr
200 205 210

GCC AAG TTT CTC AGT CAA AAC AGT ATG TTG GGT AGC GAT TTG GAA AAA 5034
Ala Lys Phe Leu Ser Gln Asn Ser Met Leu Gly Ser Asp Leu Glu Lys
215 220 225

GAT AGG GAA CAA GGA AGA AGA CAC CGC CGT TCG TTG AAG GTG ATG TTT 5082
Asp Arg Glu Gln Gly Arg Arg His Arg Arg Ser Leu Lys Val Met Phe
230 235 240

GAC TTG AAG CGT GCT TTG GGT AAA TTC GGT AGG GAA AAG AAG AAA AGA 5130
Asp Leu Lys Arg Ala Leu Gly Lys Phe Gly Arg Glu Lys Lys Lys Arg
245 250 255

ATG GAG CGT CAA AGG CAG GCG GAG CTT GAG AAA GTT TAC GGC AGG CGG 5178
Met Glu Arg Gln Arg Gln Ala Glu Leu Glu Lys Val Tyr Gly Arg Arg
260 265 270 275

GTC ATA TTG TTC AAA TAGTTTGTGT AAAATATAGG GGATTAAAAT CAGAAATGGA 5233
Val Ile Leu Phe Lys
280



CACACTGTCA TTCCCGCGCA GGCGGGAATC TAGGTCTTTA AACTTCGGTT TTTTCCGATA 5293
AATTCTTGCC GCATTAAAAT TCCAGATTCC CGCTTTCGCG GGGATGACGG CGGGGGGATT 5353
GTTGCTTTTT CGGATAAAAT CCCGTGTTTT TTCATCTGCT AGGTAAAATC GCCCCAAAGC 5413
GTCTGCATCG CGGCGATGGC GGCGAGTGGG GCGGTTTCTG TGCGTAAAAT CCGTTTTCCG 5473
AGTGTAACCG CCTGAAAGCC GGCTTCAAAT GCCTGTTGTT CTTCCTGTTC TGTCCAGCCG 5533
CCTTCGGGCC CGACCATAAA GACGATTGCG CCGGACGGGT GGCGGATGTC GCCGAGTTTG 5593
CAGGCGCGGT TGATGCTCAT AATCAGCTTG GTGTTTTCAG ACGGCATTTT GTCGAGTGCT 5653
TCACGGTAGC CGATGATGGG CAGTACGGGG GGAACGGTGT TCCTGCCGCT TTGTTCGCAC 5713
GCGGAGATGA CGATTTCCTG CCAGCGTGCG AGGCGTTTGG CGGCGCGTTC TCCGTCGAGG 5773 
CGGACGATGC AGCGTTCGCT GATGACGGGC TGTATGGCGG TTACGCCGAG TTCGACGCTT 5833 
TTTTGCAGGG TCAAAX ccAT GCGATC 5859 
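
The CDS locations in the feature table for SEQ ID NO:1 can be cross-checked against the protein lengths stated for SEQ ID NOS: 2 through 6. The following is only an illustrative sketch: the function name `cds_protein_length` is hypothetical, and it assumes that each location is 1-based and inclusive and that the stated span includes the stop codon.

```python
# CDS locations copied from the feature table for SEQ ID NO:1 (1-based, inclusive).
CDS_FEATURES = {
    "glyS": (1, 381),
    "lgtA": (445, 1491),
    "lgtC": (2342, 3262),
    "lgtD": (3322, 4335),
    "lgtE": (4354, 5196),
}


def cds_protein_length(start: int, end: int) -> int:
    """Residue count for a CDS span, assuming the span includes the stop codon."""
    span = end - start + 1
    if span % 3 != 0:
        raise ValueError(f"CDS span {span} is not a whole number of codons")
    return span // 3 - 1  # subtract the stop codon


protein_lengths = {
    gene: cds_protein_length(start, end)
    for gene, (start, end) in CDS_FEATURES.items()
}
```

Under those assumptions the computed lengths are 126, 348, 306, 337 and 280 residues, which agree with the LENGTH entries given for SEQ ID NOS: 2, 3, 4, 5 and 6 respectively.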

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 126 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 

Leu Gln Ala Val Ala Val Phe Lys Gln Leu Pro Glu Ala Ala Ala Leu
1 5 10 15

Ala Ala Ala Asn Lys Arg Val Gln Asn Leu Leu Lys Lys Ala Asp Ala
20 25 30

Ala Leu Gly Glu Val Asn Glu Ser Leu Leu Gln Gln Asp Glu Glu Lys
35 40 45

Ala Leu Tyr Ala Ala Ala Gln Gly Leu Gln Pro Lys Ile Ala Ala Ala
50 55 60

Val Ala Glu Gly Asn Phe Arg Thr Ala Leu Ser Glu Leu Ala Ser Val
65 70 75 80

Lys Pro Gln Val Asp Ala Phe Phe Asp Gly Val Met Val Met Ala Glu
85 90 95

Asp Ala Ala Val Lys Gln Asn Arg Leu Asn Leu Leu Asn Arg Leu Ala
100 105 110

Glu Gln Met Asn Ala Val Ala Asp Ile Ala Leu Leu Gly Glu
115 120 125
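
The correspondence between SEQ ID NO:2 and the first coding region of SEQ ID NO:1 can be spot-checked by translating the first 48 nucleotides of SEQ ID NO:1. This is a minimal sketch that assumes the standard genetic code and reading frame 1; the names `CODON_TABLE` and `translate` are illustrative and do not come from the listing.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1 to one-letter amino acid codes ('*' marks a stop)."""
    dna = dna.replace(" ", "").upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# Nucleotides 1-48 of SEQ ID NO:1, copied from the listing.
first_row = "CTG CAG GCC GTC GCC GTA TTC AAA CAA CTG CCC GAA GCC GCC GCG CTC"
peptide = translate(first_row)
```

In one-letter code the result is LQAVAVFKQLPEAAAL, which corresponds to residues 1 to 16 of SEQ ID NO:2 (Leu Gln Ala Val Ala Val Phe Lys Gln Leu Pro Glu Ala Ala Ala Leu).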



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 348 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

Met Gln Pro Leu Val Ser Val Leu Ile Cys Ala Tyr Asn Val Glu Lys
1 5 10 15

Tyr Phe Ala Gln Ser Leu Ala Ala Val Val Asn Gln Thr Trp Arg Asn
20 25 30

Leu Asp Ile Leu Ile Val Asp Asp Gly Ser Thr Asp Gly Thr Leu Ala
35 40 45

Ile Ala Lys Asp Phe Gln Lys Arg Asp Ser Arg Ile Lys Ile Leu Ala
50 55 60

Gln Ala Gln Asn Ser Gly Leu Ile Pro Ser Leu Asn Ile Gly Leu Asp
65 70 75 80

Glu Leu Ala Lys Ser Gly Gly Gly Gly Gly Glu Tyr Ile Ala Arg Thr
85 90 95

Asp Ala Asp Asp Ile Ala Ser Pro Gly Trp Ile Glu Lys Ile Val Gly
100 105 110

Glu Met Glu Lys Asp Arg Ser Ile Ile Ala Met Gly Ala Trp Leu Glu
115 120 125

Val Leu Ser Glu Glu Lys Asp Gly Asn Arg Leu Ala Arg His His Lys
130 135 140

His Gly Lys Ile Trp Lys Lys Pro Thr Arg His Glu Asp Ile Ala Ala
145 150 155 160

Phe Phe Pro Phe Gly Asn Pro Ile His Asn Asn Thr Met Ile Met Arg
165 170 175

Arg Ser Val Ile Asp Gly Gly Leu Arg Tyr Asp Thr Glu Arg Asp Trp
180 185 190

Ala Glu Asp Tyr Gln Phe Trp Tyr Asp Val Ser Lys Leu Gly Arg Leu
195 200 205

Ala Tyr Tyr Pro Glu Ala Leu Val Lys Tyr Arg Leu His Ala Asn Gln
210 215 220

Val Ser Ser Lys His Ser Val Arg Gln His Glu Ile Ala Gln Gly Ile
225 230 235 240

Gln Lys Thr Ala Arg Asn Asp Phe Leu Gln Ser Met Gly Phe Lys Thr
245 250 255

Arg Phe Asp Ser Leu Glu Tyr Arg Gln Thr Lys Ala Ala Ala Tyr Glu
260 265 270

Leu Pro Glu Lys Asp Leu Pro Glu Glu Asp Phe Glu Arg Ala Arg Arg
275 280 285

Phe Leu Tyr Gln Cys Phe Lys Arg Thr Asp Thr Pro Pro Ser Gly Ala
290 295 300

Trp Leu Asp Phe Ala Ala Asp Gly Arg Met Arg Arg Leu Phe Thr Leu
305 310 315 320

Arg Gln Tyr Phe Gly Ile Leu Tyr Arg Leu Ile Lys Asn Arg Arg Gln
325 330 335

Ala Arg Ser Asp Ser Ala Gly Lys Glu Gln Glu Ile
340 345



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 306 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4: 

Met Asp Ile Val Phe Ala Ala Asp Asp Asn Tyr Ala Ala Tyr Leu Cys
1 5 10 15

Val Ala Ala Lys Ser Val Glu Ala Ala His Pro Asp Thr Glu Ile Arg
20 25 30

Phe His Val Leu Asp Ala Gly Ile Ser Glu Glu Asn Arg Ala Ala Val
35 40 45

Ala Ala Asn Leu Arg Gly Gly Gly Asn Ile Arg Phe Ile Asp Val Asn
50 55 60

Pro Glu Asp Phe Ala Gly Phe Pro Leu Asn Ile Arg His Ile Ser Ile
65 70 75 80

Thr Thr Tyr Ala Arg Leu Lys Leu Gly Glu Tyr Ile Ala Asp Cys Asp
85 90 95

Lys Val Leu Tyr Leu Asp Thr Asp Val Leu Val Arg Asp Gly Leu Lys
100 105 110

Pro Leu Trp Asp Thr Asp Leu Gly Gly Asn Trp Val Gly Ala Cys Ile
115 120 125

Asp Leu Phe Val Glu Arg Gln Glu Gly Tyr Lys Gln Lys Ile Gly Met
130 135 140

Ala Asp Gly Glu Tyr Tyr Phe Asn Ala Gly Val Leu Leu Ile Asn Leu
145 150 155 160

Lys Lys Trp Arg Arg His Asp Ile Phe Lys Met Ser Cys Glu Trp Val
165 170 175

Glu Gln Tyr Lys Asp Val Met Gln Tyr Gln Asp Gln Asp Ile Leu Asn
180 185 190

Gly Leu Phe Lys Gly Gly Val Cys Tyr Ala Asn Ser Arg Phe Asn Phe
195 200 205

Met Pro Thr Asn Tyr Ala Phe Met Ala Asn Gly Phe Ala Ser Arg His
210 215 220

Thr Asp Pro Leu Tyr Leu Asp Arg Thr Asn Thr Ala Met Pro Val Ala
225 230 235 240

Val Ser His Tyr Cys Gly Ser Ala Lys Pro Trp His Arg Asp Cys Thr
245 250 255

Val Trp Gly Ala Glu Arg Phe Thr Glu Leu Ala Gly Ser Leu Thr Thr
260 265 270

Val Pro Glu Glu Trp Arg Gly Lys Leu Ala Val Pro Pro Thr Lys Cys
275 280 285

Met Leu Gln Arg Trp Arg Lys Lys Leu Ser Ala Arg Phe Leu Arg Lys
290 295 300

Ile Tyr
305



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 337 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

Met Gln Pro Leu Val Ser Val Leu Ile Cys Ala Tyr Asn Ala Glu Lys
1 5 10 15

Tyr Phe Ala Gln Ser Leu Ala Ala Val Val Gly Gln Thr Trp Arg Asn
20 25 30

Leu Asp Ile Leu Ile Val Asp Asp Gly Ser Thr Asp Gly Thr Pro Ala
35 40 45

Ile Ala Arg His Phe Gln Glu Gln Asp Gly Arg Ile Arg Ile Ile Ser
50 55 60

Asn Pro Arg Asn Leu Gly Phe Ile Ala Ser Leu Asn Ile Gly Leu Asp
65 70 75 80

Glu Leu Ala Lys Ser Gly Gly Gly Glu Tyr Ile Ala Arg Thr Asp Ala
85 90 95

Asp Asp Ile Ala Ser Pro Gly Trp Ile Glu Lys Ile Val Gly Glu Met
100 105 110

Glu Lys Asp Arg Ser Ile Ile Ala Met Gly Ala Trp Leu Glu Val Leu
115 120 125

Ser Glu Glu Asn Asn Lys Ser Val Leu Ala Ala Ile Ala Arg Asn Gly
130 135 140

Ala Ile Trp Asp Lys Pro Thr Arg His Glu Asp Ile Val Ala Val Phe
145 150 155 160

Pro Phe Gly Asn Pro Ile His Asn Asn Thr Met Ile Met Arg Arg Ser
165 170 175

Val Ile Asp Gly Gly Leu Arg Phe Asp Pro Ala Tyr Ile His Ala Glu
180 185 190

Asp Tyr Lys Phe Trp Tyr Glu Ala Gly Lys Leu Gly Arg Leu Ala Tyr
195 200 205

Tyr Pro Glu Ala Leu Val Lys Tyr Arg Phe His Gln Asp Gln Thr Ser
210 215 220

Ser Lys Tyr Asn Leu Gln Gln Arg Arg Thr Ala Trp Lys Ile Lys Glu
225 230 235 240

Glu Ile Arg Ala Gly Tyr Trp Lys Ala Ala Gly Ile Ala Val Gly Ala
245 250 255

Asp Cys Leu Asn Tyr Gly Leu Leu Lys Ser Thr Ala Tyr Ala Leu Tyr
260 265 270

Glu Lys Ala Leu Ser Gly Gln Asp Ile Gly Cys Leu Arg Leu Phe Leu
275 280 285

Tyr Glu Tyr Phe Leu Ser Leu Glu Lys Tyr Ser Leu Thr Asp Leu Leu
290 295 300

Asp Phe Leu Thr Asp Arg Val Met Arg Lys Leu Phe Ala Ala Pro Gln
305 310 315 320

Tyr Arg Lys Ile Leu Lys Lys Met Leu Arg Pro Trp Lys Tyr Arg Ser
325 330 335

Tyr



(2) INFORMATION FOR SEQ ID NO: 6: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 280 amino acids 
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Met Gln Asn His Val Ile Ser Leu Ala Ser Ala Ala Glu Arg Arg Ala
1 5 10 15

His Ile Ala Asp Thr Phe Gly Ser Arg Gly Ile Pro Phe Gln Phe Phe
20 25 30

Asp Ala Leu Met Pro Ser Glu Arg Leu Glu Gln Ala Met Ala Glu Leu
35 40 45

Val Pro Gly Leu Ser Ala His Pro Tyr Leu Ser Gly Val Glu Lys Ala
50 55 60

Cys Phe Met Ser His Ala Val Leu Trp Glu Gln Ala Leu Asp Glu Gly
65 70 75 80

Leu Pro Tyr Ile Ala Val Phe Glu Asp Asp Val Leu Leu Gly Glu Gly
85 90 95

Ala Glu Gln Phe Leu Ala Glu Asp Thr Trp Leu Glu Glu Arg Phe Asp
100 105 110

Lys Asp Ser Ala Phe Ile Val Arg Leu Glu Thr Met Phe Ala Lys Val
115 120 125

Ile Val Arg Pro Asp Lys Val Leu Asn Tyr Glu Asn Arg Ser Phe Pro
130 135 140

Leu Leu Glu Ser Glu His Cys Gly Thr Ala Gly Tyr Ile Ile Ser Arg
145 150 155 160

Glu Ala Met Arg Phe Phe Leu Asp Arg Phe Ala Val Leu Pro Pro Glu
165 170 175

Arg Ile Lys Ala Val Asp Leu Met Met Phe Thr Tyr Phe Phe Asp Lys
180 185 190

Glu Gly Met Pro Val Tyr Gln Val Ser Pro Ala Leu Cys Thr Gln Glu
195 200 205

Leu His Tyr Ala Lys Phe Leu Ser Gln Asn Ser Met Leu Gly Ser Asp
210 215 220

Leu Glu Lys Asp Arg Glu Gln Gly Arg Arg His Arg Arg Ser Leu Lys
225 230 235 240

Val Met Phe Asp Leu Lys Arg Ala Leu Gly Lys Phe Gly Arg Glu Lys
245 250 255

Lys Lys Arg Met Glu Arg Gln Arg Gln Ala Glu Leu Glu Lys Val Tyr
260 265 270

Gly Arg Arg Val Ile Leu Phe Lys
275 280

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5859 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: both

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Neisseria gonorrhoeae

(B) STRAIN: F62

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1491..2330

(C) GENE: lgtB

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

CTGCAGGCCG TCGCCGTATT CAAACAACTG CCCGAAGCCG CCGCGCTCGC CGCCGCCAAC 60 

AAACGCGTGC AAAACCTGCT GAAAAAAGCC GATGCCGCGT TGGGCGAAGT CAATGAAAGC 120 

CTGCTGCAAC AGGACGAAGA AAAAGCCCTG TACGCTGCCG CGCAAGGTTT GCAGCCGAAA 180

ATTGCCGCCG CCGTCGCCGA AGGCAATTTC CGAACCGCCT TGTCCGAACT GGCTTCCGTC 240

AAGCCGCAGG TTGATGCCTT CTTCGACGGC GTGATGGTGA TGGCGGAAGA TGCCGCCGTA 300

AAACAAAACC GCCTGAACCT GCTGAACCGC TTGGCAGAGC AGATGAACGC GGTGGCCGAC 360

ATCGCGCTTT TGGGCGAGTA ACCGTTGTAC AGTCCAAATG CCGTCTGAAG CCTTCAGGCG 420

GCATCAAATT ATCGGGAGAG TAAATTGCAG CCTTTAGTCA GCGTATTGAT TTGCGCCTAC 480

AACGTAGAAA AATATTTTGC CCAATCATTA GCCGCCGTCG TGAATCAGAC TTGGCGCAAC 540

TTGGATATTT TGATTGTCGA TGACGGCTCG ACAGACGGCA CACTTGCCAT TGCCAAGGAT 600

TTTCAAAAGC GGGACAGCCG TATCAAAATC CTTGCACAAG CTCAAAATTC CGGCCTGATT 660

CCCTCTTTAA ACATCGGGCT GGACGAATTG GCAAAGTCGG GGGGGGGGGG GGGGGAATAT 720

ATTGCGCGCA CCGATGCCGA CGATATTGCC TCCCCCGGCT GGATTGAGAA AATCGTGGGC 780

GAGATGGAAA AAGACCGCAG CATCATTGCG ATGGGCGCGT GGCTGGAAGT TTTGTCGGAA 840

GAAAAGGACG GCAACCGGCT GGCGCGGCAC CACAAACACG GCAAAATTTG GAAAAAGCCG 900

ACCCGGCACG AAGACATCGC CGCCTTTTTC CCTTTCGGCA ACCCCATACA CAACAACACG 960

ATGATTATGC GGCGCAGCGT CATTGACGGC GGTTTGCGTT ACGACACCGA GCGGGATTGG 1020

GCGGAAGATT ACCAATTTTG GTACGATGTC AGCAAATTGG GCAGGCTGGC TTATTATCCC 1080

GAAGCCTTGG TCAAATACCG CCTTCACGCC AATCAGGTTT CATCCAAACA CAGCGTCCGC 1140

CAACACGAAA TCGCGCAAGG CATCCAAAAA ACCGCCAGAA ACGATTTTTT GCAGTCTATG 1200

GGTTTTAAAA CCCGGTTCGA CAGCCTAGAA TACCGCCAAA CAAAAGCAGC GGCGTATGAA 1260

CTGCCGGAGA AGGATTTGCC GGAAGAAGAT TTTGAACGCG CCCGCCGGTT TTTGTACCAA 1320

TGCTTCAAAC GGACGGACAC GCCGCCCTCC GGCGCGTGGC TGGATTTCGC GGCAGACGGC 1380

AGGATGAGGC GGCTGTTTAC CTTGAGGCAA TACTTCGGCA TTTTGTACCG GCTGATTAAA 1440

AACCGCCGGC AGGCGCGGTC GGATTCGGCA GGGAAAGAAC AGGAGATTTA ATG CAA 1496
Met Gln



AAC CAC GTT ATC AGC TTG GCT TCC GCC GCA GAA 
Asn His Val lie Ser Leu Ala Ser Ala Ala Glu 
5 10 m 

GCC GCA ACC TTC GGC AGT CGC GGC ATC CCG TTC 
Ala Ala Thr Phe Gly Ser Arg Gly lie Pro Phe 
20 * 25 

CTG ATG CCG TCT GAA AGG CTG GAA CGG GCA ATG 
Leu Met Pro Ser Glu Arg Leu Glu Arg Ala Met 
35 40 45 

GGC TTG TCG GCG CAC CCC TAT TTG AGC GGA GTG 
Gly Leu Ser Ala His Pro Tyr Leu Ser Gly Val 
55 60 



CGC AGG GCG CAC ATT 
Arg Arg Ala His lie 
15 

CAG TTT TTC GAC GCA 
Gin Phe Phe Asp Ala 
30 

GCG GAA CTC GTC CCC 
Ala Glu Leu Val Pro 
50 

GAA AAA GCC TGC TTT 
Glu Lys Ala Cys Phe 
65 



1544 



1592 



1640 



1688 
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a 



ru 
m 
si 

n 

ru 
a 
hi 
o 



ATG 
Met 


AGC 

Ser 


His 


Ala 
70 


GIA 

Val 


TTG 
Leu 


1GG 

Trp 


TV 7V 

GAA 

Glu 


CAG GCA 
Gin Ala 
75 


TTG 
Leu 


GAC 
Asp 


GAA 
Glu 


GGC 
Gly 
80 


GTA 
Val 


CCG 
Pro 


1736 


TAT 
Tyr 


ATC 
He 


GCC 

Ala 
85 


GlA 

val 


TTT 
Phe 


ni\ a 
GAA 

Glu 


GA1 

Asp 


GAX 

ASP 
90 


GTC 
Val 


TTA 
Leu 


CTC 
Leu 


GGC 
Gly 


GAA 
Glu 
95 


GGC 
Gly 


GCG 
Ala 


GAG 
Glu 


1784 


CAG 

Gin 


TTC 
Phe 
100 


til 

Leu 


nnn 
Ala 


Glu 


GA1 

Asp 


At X 

Thr 
105 


loo 

Trp 


CTG 
Leu 


CAA 
Gin 


GAA 

Glu 


nnn 
tGt 

Arg 
110 


TTT 
Phe 


m\n 
GAC 

ASp 


CCC GAT 
Pro Asp 


1832 


1 tt 

Ser 
115 


err* 
Ala 


TTT 
Phe 


bit 

Val 


bit 

val 


nnn 
tot 

Arg 

120 


1 I O 

Leu 


bAR 

Glu 


ACG 
Thr 


ATG 
Met 


TTT 
Phe 
125 


TV TO 

A1G 

Met 


CAC 

His 


nn*n 
GxC 

val 


CTG 
Leu 


ACC 
Thr 
130 


1 D D A 

loo U 


ICG 

Ser 


nnn 
Pro 


Ser 


VjVjt 

Gly 


Val 
135 


nnn 

Ut>J 

Ala 


nun 

bAb 

Asp 


X At 

Tyr 


GGC GGG 
Gly Gly 
140 


nnn 
tot 

Arg 


nnn 
Ala 


TTT 
Phe 


nnn 
CCG 

Pro 


CTT 
Leu 
145 


TTG 
Leu 




GAA. 
Glu 


Ser 


GAA 
GlU 


CAC 

His 
150 


IGt 

Cys 


nnn 

GGG 

Gly 


7i nn 

AtG 

Thr 


GCG 
Ala 


GGC TAT 
Gly Tyr 
"155 


ATT 
He 


ATT 
He 


n*nn 

ICC 

Ser 


nnn 
CGA 

Arg 
160 


AAG GCG 
Lys Ala 


l97o 


ATG 

Met 


CGI 

Arg 


TTT 
Phe 
165 


TTC 
Phe 


TTG 
Leu 


nnn 

GAt 

Asp 


Aoo 

Arg 


TTT 
Phe 
170 


GCC 
Ala 


GTT 
Val 


1 1G 

Leu 


nnn 
ttG 

Pro 


nnn 

CCC 

Pro 

175 


nf\ ta 
GAA 

Glu 


CGC CTG 
Arg Leu 


O AO A 
Z\JZ<t 


VJAL 

His 


tt 1 

Pro 
180 


Val 


Asp 


llu 

Leu 


niu 

Met 


niu 

Met 
185 


lit 
Phe 


GGC AAC 
Gly Asn 


nnn* 
tt i 

Pro 


GAt 

Asp 
190 


nun 

GAt 

Asp 


a nn 

AGG 

Arg 


GAA GGA 
Glu Gly 


2072


Met 
195 


nhn 

ttG 

Pro 


ol 1 

Val 


lot 

Cys 


Gln


t X t 

Leu 
200 


7171T 

Asn 


ttt 
Pro 


GCC 
Ala 


TTG 
Leu 


*vnn 

lot 

Cys 
205 


cine* 
ott 

Ala 


r*7i 7\ 

tAA 

Gln


ni\n 

GAG 

Glu 


CTG 
Leu 


CAT 
His 
210 


2120


rpJVT 

Tyr 


nnn 
Ala 


Lys 


TTT 
Phe 


tnt 

His 
215 


Asp 


^"•7171 

Gln


71 TIP 
nnb 

Asn 


AGC 
Ser 


GCA 
Ala 
220 


1 lu 

Leu 


cine 
Gly 


j\rzn 

nut 

Ser 


nTn 

t ib 

Leu 


ATC 
Ile
225 


GAA 
Glu 


2168


CAT 
His 


GAC 
Asp 


CGC 
Arg 


CGC 
Arg 
230 


CTG 
Leu 


AAC 
Asn 


CGC 
Arg 


AAA 
Lys 


CAG 
Gln
235


CAA
Gln


TGG 
Trp 


CGC 
Arg 


GAT 
Asp 


TCC 
Ser 
240 


CCC 
Pro 


GCC 
Ala 


2216 


AAC 
Asn 


ACA 
Thr 


TTC 
Phe 
245 


AAA 
Lys 


CAC 
His 


CGC 
Arg 


CTG 
Leu 


ATC 
Ile
250 


CGC GCC 
Arg Ala


TTG 
Leu 


ACC 
Thr 


AAA 
Lys 
255 


ATC 
Ile


GGC AGG 
Gly Arg 


2264 


GAA 
Glu 


AGG 
Arg 
260 


GAA 
Glu 


AAA 
Lys 


CGC 
Arg 


CGG 
Arg 


CAA 
Gln

265 


AGG CGC GAA
Arg Arg Glu


CAG
Gln


TTA 
Leu 
270 


ATC 
Ile


GGC 
Gly 


AAG 
Lys 


ATT 

Ile


2312 



ATT GTG CCT TTC CAA TAAAAGGAGA AAAGATGGAC ATCGTATTTG CGGCAGACGA 2367 
Ile Val Pro Phe Gln
275 280 

CAACTATGCC GCCTACCTTT GCGTTGCGGC AAAAAGCGTG GAAGCGGCCC ATCCCGATAC 2427 

GGAAATCAGG TTCCACGTCC TCGATGCCGG CATCAGTGAG GAAAACCGGG CGGCGGTTGC 2487 

CGCCAATTTG CGGGGGGGGG GTAATATCCG CTTTATAGAC GTAAACCCCG AAGATTTCGC 2547




CGGCTTCCCC TTAAACATCA GGCACATTTC CATTACGACT TATGCCCGCC TGAAATTGGG 2607

CGAATACATT GCCGATTGCG ACAAAGTCCT GTATCTGGAT ACGGACGTAT TGGTCAGGGA 2667

CGGCCTGAAG CCCTTATGGG ATACCGATTT GGGCGGTAAC TGGGTCGGCG CGTGCATCGA 2727

TTTGTTTGTC GAAAGGCAGG AAGGATACAA ACAAAAAATC GGTATGGCGG ACGGAGAATA 2787

TTATTTCAAT GCCGGCGTAT TGCTGATCAA CCTGAAAAAG TGGCGGCGGC ACGATATTTT 2847

CAAAATGTCC TGCGAATGGG TGGAACAATA CAAGGACGTG ATGCAATATC AGGATCAGGA 2907

CATTTTGAAC GGGCTGTTTA AAGGCGGGGT GTGTTATGCG AACAGCCGTT TCAACTTTAT 2967

GCCGACCAAT TATGCCTTTA TGGCGAACGG GTTTGCGTCC CGCCATACCG ACCCGCTTTA 3027

CCTCGACCGT ACCAATACGG CGATGCCCGT CGCCGTCAGC CATTATTGCG GCTCGGCAAA 3087

GCCGTGGCAC AGGGACTGCA CCGTTTGGGG TGCGGAACGT TTCACAGAGT TGGCCGGCAG 3147

CCTGACGACC GTTCCCGAAG AATGGCGCGG CAAACTTGCC GTCCCGCCGA CAAAGTGTAT 3207

GCTTCAAAGA TGGCGCAAAA AGCTGTCTGC CAGATTCTTA CGCAAGATTT ATTGACGGGG 3267

CAGGCCGTCT GAAGCCTTCA GACGGCATCG GACGTATCGG AAAGGAGAAA CGGATTGCAG 3327

CCTTTAGTCA GCGTATTGAT TTGCGCCTAC AACGCAGAAA AATATTTTGC CCAATCATTG 3387

GCCGCCGTAG TGGGGCAGAC TTGGCGCAAC TTGGATATTT TGATTGTCGA TGACGGCTCG 3447

ACGGACGGCA CGCCCGCCAT TGCCCGGCAT TTCCAAGAAC AGGACGGCAG GATCAGGATA 3507

ATTTCCAATC CCCGCAATTT GGGCTTTATC GCCTCTTTAA ACATCGGGCT GGACGAATTG 3567

GCAAAGTCGG GGGGGGGGGA ATATATTGCG CGCACCGATG CCGACGATAT TGCCTCCCCC 3627

GGCTGGATTG AGAAAATCGT GGGCGAGATG GAAAAAGACC GCAGCATCAT TGCGATGGGC 3687

GCGTGGTTGG AAGTTTTGTC GGAAGAAAAC AATAAAAGCG TGCTTGCCGC CATTGCCCGA 3747

AACGGCGCAA TTTGGGACAA ACCGACCCGG CATGAAGACA TTGTCGCCGT TTTCCCTTTC 3807

GGCAACCCCA TACACAACAA CACGATGATT ATGAGGCGCA GCGTCATTGA CGGCGGTTTG 3867

CGGTTCGATC CAGCCTATAT CCACGCCGAA GACTATAAGT TTTGGTACGA AGCCGGCAAA 3927

CTGGGCAGGC TGGCTTATTA TCCCGAAGCC TTGGTCAAAT ACCGCTTCCA TCAAGACCAG 3987

ACTTCTTCCA AATACAACCT GCAACAGCGC AGGACGGCGT GGAAAATCAA AGAAGAAATC 4047

AGGGCGGGGT ATTGGAAGGC GGCAGGCATA GCCGTCGGGG CGGACTGCCT GAATTACGGG 4107

CTTTTGAAAT CAACGGCATA TGCGTTGTAC GAAAAAGCCT TGTCCGGACA GGATATCGGA 4167

TGCCTCCGCC TGTTCCTGTA CGAATATTTC TTGTCGTTGG AAAAGTATTC TTTGACCGAT 4227

TTGCTGGATT TCTTGACAGA CCGCGTGATG AGGAAGCTGT TTGCCGCACC GCAATATAGG 4287

AAAATCCTGA AAAAAATGTT ACGCCCTTGG AAATACCGCA GCTATTGAAA CCGAACAGGA 4347

TAAATCATGC AAAACCACGT TATCAGCTTG GCTTCCGCCG CAGAGCGCAG GGCGCACATT 4407

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GCCGATACCT TCGGCAGTCG CGGCATCCCG TTCCAGTTTT TCGACGCACT GATGCCGTCT 4467

GAAAGGCTGG AACAGGCGAT GGCGGAACTC GTCCCCGGCT TGTCGGCGCA CCCCTATTTG 4527

AGCGGAGTGG AAAAAGCCTG CTTTATGAGC CACGCCGTAT TGTGGGAACA GGCGTTGGAT 4587

GAAGGTCTGC CGTATATCGC CGTATTTGAG GACGACGTTT TACTCGGCGA AGGCGCGGAG 4647

CAGTTCCTTG CCGAAGATAC TTGGTTGGAA GAGCGTTTTG ACAAGGATTC CGCCTTTATC 4707

GTCCGTTTGG AAACGATGTT TGCGAAAGTT ATTGTCAGAC CGGATAAAGT CCTGAATTAT 4767

GAAAACCGGT CATTTCCTTT GCTGGAGAGC GAACATTGTG GGACGGCTGG CTATATCATT 4827

TCGCGTGAGG CGATGCGGTT TTTCTTGGAC AGGTTTGCCG TTTTGCCGCC AGAGCGGATT 4887

AAAGCGGTAG ATTTGATGAT GTTTACTTAT TTCTTTGATA AGGAGGGGAT GCCTGTTTAT 4947

CAGGTTAGTC CCGCCTTATG TACCCAAGAA TTGCATTATG CCAAGTTTCT CAGTCAAAAC 5007

AGTATGTTGG GTAGCGATTT GGAAAAAGAT AGGGAACAAG GAAGAAGACA CCGCCGTTCG 5067

TTGAAGGTGA TGTTTGACTT GAAGCGTGCT TTGGGTAAAT TCGGTAGGGA AAAGAAGAAA 5127

AGAATGGAGC GTCAAAGGCA GGCGGAGCTT GAGAAAGTTT ACGGCAGGCG GGTCATATTG 5187

TTCAAATAGT TTGTGTAAAA TATAGGGGAT TAAAATCAGA AATGGACACA CTGTCATTCC 5247

CGCGCAGGCG GGAATCTAGG TCTTTAAACT TCGGTTTTTT CCGATAAATT CTTGCCGCAT 5307

TAAAATTCCA GATTCCCGCT TTCGCGGGGA TGACGGCGGG GGGATTGTTG CTTTTTCGGA 5367

TAAAATCCCG TGTTTTTTCA TCTGCTAGGT AAAATCGCCC CAAAGCGTCT GCATCGCGGC 5427

GATGGCGGCG AGTGGGGCGG TTTCTGTGCG TAAAATCCGT TTTCCGAGTG TAACCGCCTG 5487

AAAGCCGGCT TCAAATGCCT GTTGTTCTTC CTGTTCTGTC CAGCCGCCTT CGGGCCCGAC 5547

CATAAAGACG ATTGCGCCGG ACGGGTGGCG GATGTCGCCG AGTTTGCAGG CGCGGTTGAT 5607

GCTCATAATC AGCTTGGTGT TTTCAGACGG CATTTTGTCG AGTGCTTCAC GGTAGCCGAT 5667

GATGGGCAGT ACGGGGGGAA CGGTGTTCCT GCCGCTTTGT TCGCACGCGG AGATGACGAT 5727

TTCCTGCCAG CGTGCGAGGC GTTTGGCGGC GCGTTCTCCG TCGAGGCGGA CGATGCAGCG 5787

TTCGCTGATG ACGGGCTGTA TGGCGGTTAC GCCGAGTTCG ACGCTTTTTT GCAGGGTGAA 5847

ATCCATGCGA TC 5859



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 279 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:



Met Gln Asn His Val Ile Ser Leu Ala Ser Ala Ala Glu Arg Arg Ala
1               5                   10                  15

His Ile Ala Ala Thr Phe Gly Ser Arg Gly Ile Pro Phe Gln Phe Phe
            20                  25                  30

Asp Ala Leu Met Pro Ser Glu Arg Leu Glu Arg Ala Met Ala Glu Leu
        35                  40                  45

Val Pro Gly Leu Ser Ala His Pro Tyr Leu Ser Gly Val Glu Lys Ala
    50                  55                  60

Cys Phe Met Ser His Ala Val Leu Trp Glu Gln Ala Leu Asp Glu Gly
65                  70                  75                  80

Val Pro Tyr Ile Ala Val Phe Glu Asp Asp Val Leu Leu Gly Glu Gly
                85                  90                  95

Ala Glu Gln Phe Leu Ala Glu Asp Thr Trp Leu Gln Glu Arg Phe Asp
            100                 105                 110

Pro Asp Ser Ala Phe Val Val Arg Leu Glu Thr Met Phe Met His Val
        115                 120                 125

Leu Thr Ser Pro Ser Gly Val Ala Asp Tyr Gly Gly Arg Ala Phe Pro
    130                 135                 140

Leu Leu Glu Ser Glu His Cys Gly Thr Ala Gly Tyr Ile Ile Ser Arg
145                 150                 155                 160

Lys Ala Met Arg Phe Phe Leu Asp Arg Phe Ala Val Leu Pro Pro Glu
                165                 170                 175

Arg Leu His Pro Val Asp Leu Met Met Phe Gly Asn Pro Asp Asp Arg
            180                 185                 190

Glu Gly Met Pro Val Cys Gln Leu Asn Pro Ala Leu Cys Ala Gln Glu
        195                 200                 205

Leu His Tyr Ala Lys Phe His Asp Gln Asn Ser Ala Leu Gly Ser Leu
    210                 215                 220

Ile Glu His Asp Arg Arg Leu Asn Arg Lys Gln Gln Trp Arg Asp Ser
225                 230                 235                 240

Pro Ala Asn Thr Phe Lys His Arg Leu Ile Arg Ala Leu Thr Lys Ile
                245                 250                 255

Gly Arg Glu Arg Glu Lys Arg Arg Gln Arg Arg Glu Gln Leu Ile Gly
            260                 265                 270

Lys Ile Ile Val Pro Phe Gln
        275



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown
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(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: PCR primer 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

GCCGAGAAAA CTATTGGTGG A 21 

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: PCR primer

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

AAAACATGCA GGAATTGACG AT 22

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 348 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Leu Gln Pro Leu Val Ser Val Leu Ile Cys Ala Tyr Asn Val Glu Lys
1               5                   10                  15

Tyr Phe Ala Gln Ser Leu Ala Ala Val Val Asn Gln Thr Trp Arg Asn
            20                  25                  30

Leu Asp Ile Leu Ile Val Asp Asp Gly Ser Thr Asp Gly Thr Leu Ala
        35                  40                  45

Ile Ala Lys Asp Phe Gln Lys Arg Asp Ser Arg Ile Lys Ile Leu Ala
    50                  55                  60

Gln Ala Gln Asn Ser Gly Leu Ile Pro Ser Leu Asn Ile Gly Leu Asp
65                  70                  75                  80

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Glu Leu Ala Lys Ser Gly Gly Gly Gly Gly Glu Tyr Ile Ala Arg Thr
                85                  90                  95

Asp Ala Asp Asp Ile Ala Ser Pro Gly Trp Ile Glu Lys Ile Val Gly
            100                 105                 110

Glu Met Glu Lys Asp Arg Ser Ile Ile Ala Met Gly Ala Trp Leu Glu
        115                 120                 125

Val Leu Ser Glu Glu Lys Asp Gly Asn Arg Leu Ala Arg His His Lys
    130                 135                 140

His Gly Lys Ile Trp Lys Lys Pro Thr Arg His Glu Asp Ile Ala Ala
145                 150                 155                 160

Phe Phe Pro Phe Gly Asn Pro Ile His Asn Asn Thr Met Ile Met Arg
                165                 170                 175

Arg Ser Val Ile Asp Gly Gly Leu Arg Tyr Asp Thr Glu Arg Asp Trp
            180                 185                 190

Ala Glu Asp Tyr Gln Phe Trp Tyr Asp Val Ser Lys Leu Gly Arg Leu
        195                 200                 205

Ala Tyr Tyr Pro Glu Ala Leu Val Lys Tyr Arg Leu His Ala Asn Gln
    210                 215                 220

Val Ser Ser Lys His Ser Val Arg Gln His Glu Ile Ala Gln Gly Ile
225                 230                 235                 240

Gln Lys Thr Ala Arg Asn Asp Phe Leu Gln Ser Met Gly Phe Lys Thr
                245                 250                 255

Arg Phe Asp Ser Leu Glu Tyr Arg Gln Thr Lys Ala Ala Ala Tyr Glu
            260                 265                 270

Leu Pro Glu Lys Asp Leu Pro Glu Glu Asp Phe Glu Arg Ala Arg Arg
        275                 280                 285

Phe Leu Tyr Gln Cys Phe Lys Arg Thr Asp Thr Pro Pro Ser Gly Ala
    290                 295                 300

Trp Leu Asp Phe Ala Ala Asp Gly Arg Met Arg Arg Leu Phe Thr Leu
305                 310                 315                 320

Arg Gln Tyr Phe Gly Ile Leu Tyr Arg Leu Ile Lys Asn Arg Arg Gln
                325                 330                 335

Ala Arg Ser Asp Ser Ala Gly Lys Glu Gln Glu Ile
            340                 345

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 337 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

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Leu Gln Pro Leu Val Ser Val Leu Ile Cys Ala Tyr Asn Ala Glu Lys
1               5                   10                  15

Tyr Phe Ala Gln Ser Leu Ala Ala Val Val Gly Gln Thr Trp Arg Asn
            20                  25                  30

Leu Asp Ile Leu Ile Val Asp Asp Gly Ser Thr Asp Gly Thr Pro Ala
        35                  40                  45

Ile Ala Arg His Phe Gln Glu Gln Asp Gly Arg Ile Arg Ile Ile Ser
    50                  55                  60

Asn Pro Arg Asn Leu Gly Phe Ile Ala Ser Leu Asn Ile Gly Leu Asp
65                  70                  75                  80

Glu Leu Ala Lys Ser Gly Gly Gly Glu Tyr Ile Ala Arg Thr Asp Ala
                85                  90                  95

Asp Asp Ile Ala Ser Pro Gly Trp Ile Glu Lys Ile Val Gly Glu Met
            100                 105                 110

Glu Lys Asp Arg Ser Ile Ile Ala Met Gly Ala Trp Leu Glu Val Leu
        115                 120                 125

Ser Glu Glu Asn Asn Lys Ser Val Leu Ala Ala Ile Ala Arg Asn Gly
    130                 135                 140

Ala Ile Trp Asp Lys Pro Thr Arg His Glu Asp Ile Val Ala Val Phe
145                 150                 155                 160

Pro Phe Gly Asn Pro Ile His Asn Asn Thr Met Ile Met Arg Arg Ser
                165                 170                 175

Val Ile Asp Gly Gly Leu Arg Phe Asp Pro Ala Tyr Ile His Ala Glu
            180                 185                 190

Asp Tyr Lys Phe Trp Tyr Glu Ala Gly Lys Leu Gly Arg Leu Ala Tyr
        195                 200                 205

Tyr Pro Glu Ala Leu Val Lys Tyr Arg Phe His Gln Asp Gln Thr Ser
    210                 215                 220

Ser Lys Tyr Asn Leu Gln Gln Arg Arg Thr Ala Trp Lys Ile Lys Glu
225                 230                 235                 240

Glu Ile Arg Ala Gly Tyr Trp Lys Ala Ala Gly Ile Ala Val Gly Ala
                245                 250                 255

Asp Cys Leu Asn Tyr Gly Leu Leu Lys Ser Thr Ala Tyr Ala Leu Tyr
            260                 265                 270

Glu Lys Ala Leu Ser Gly Gln Asp Ile Gly Cys Leu Arg Leu Phe Leu
        275                 280                 285

Tyr Glu Tyr Phe Leu Ser Leu Glu Lys Tyr Ser Leu Thr Asp Leu Leu
    290                 295                 300

Asp Phe Leu Thr Asp Arg Val Met Arg Lys Leu Phe Ala Ala Pro Gln
305                 310                 315                 320

Tyr Arg Lys Ile Leu Lys Lys Met Leu Arg Pro Trp Lys Tyr Arg Ser
                325                 330                 335
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